Item-by-item results
The following section presents the item-by-item results of the analysis. Each item has several tables and
a figure. The figure, called a quantile plot, shows the proportion of examinees selecting each option, for
consecutive segments of the examinees as ranked by score. The key thing to evaluate in this figure is
that the line for the correct answer has a positive slope (goes up from left to right), which means that
examinees with higher scores tend to answer correctly more often. Conversely, the lines for the incorrect
options, called distractors, should have a negative slope. Note, however, that the use of a small number
of groups (e.g., 3 or fewer) oversimplifies the graph, so that items which are very difficult or very easy
(that is, discriminating in only the top or bottom 20% of examinees) might appear to have poor quantile
plots and classical statistics. For such items, item response theory presents significant advantages in
analysis.
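As a rough illustration of how quantile plot values could be derived, the sketch below ranks examinees by total score, splits them into consecutive groups, and computes the proportion choosing each option in each group. The function name, the data, and the near-equal grouping rule are assumptions made for illustration, not the program's documented procedure. The sketch also assumes there are at least as many examinees as groups.

```python
from collections import Counter

def quantile_plot_data(responses, scores, options, n_groups=3):
    """Proportion of examinees choosing each option within score-ranked groups."""
    # Rank examinees from lowest to highest total score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    table = []
    for g in range(n_groups):
        # Consecutive, near-equal segments of the ranked examinees (assumed rule).
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        counts = Counter(responses[i] for i in members)
        table.append({opt: counts[opt] / len(members) for opt in options})
    return table

# Hypothetical data: six examinees, keyed answer "A".
responses = ["B", "C", "B", "A", "A", "A"]
scores = [2, 3, 5, 7, 8, 9]
plot_data = quantile_plot_data(responses, scores, ["A", "B", "C"], n_groups=3)
for group, row in enumerate(plot_data, start=1):
    print(group, row)
```

In this made-up example the proportion choosing "A" rises across the three groups while the proportions for "B" and "C" do not, which is the pattern a well-behaved item should show.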
There are four tables presented for each item.
1. Item information table: records the information supplied by the control file (or Iteman 3 header) for this
item.
2. Item statistics table: overall item statistics.
3. Option statistics: detailed statistics for each option, which help diagnose issues in items with poor
statistics.
4. Quantile plot data: the values used to create the quantile plot.
The item statistics table presents overall item statistics in the first row of numbers. The two most
important item-level statistics for dichotomously scored (correct/incorrect) items are the P value and the
point-biserial correlation, which represent the difficulty and discrimination of the item, respectively. For
polytomously scored (rating scale or partial credit) items, the difficulty is represented by the mean
(average) item score, while the discrimination is represented by a Pearson r correlation.
The P value is the proportion of examinees that answered an item in the keyed direction. P ranges from
0 to 1. A high value (e.g., 0.95) means that an item is easy; a low value (e.g., 0.25) means that the item is difficult.
The point-biserial correlation (Rpbis) is a measure of the discriminating, or differentiating, power of the
item. Rpbis ranges from -1 to 1. A negative Rpbis indicates a bad item, because lower-scoring
examinees are more likely than higher-scoring examinees to respond in the keyed direction.
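A minimal sketch of the two statistics follows. It relies on the fact that the point-biserial is the ordinary Pearson correlation between a 0/1 item score and the total score. The data and function names are hypothetical, and the sketch correlates against the raw total score. Whether the program removes the item from the total before correlating is not stated here, so no such correction is applied.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def p_value(item_correct):
    """Proportion of examinees who answered in the keyed direction."""
    return sum(item_correct) / len(item_correct)

# Hypothetical data: 1 = keyed response, 0 = any other response.
item_correct = [0, 0, 0, 1, 1, 1]
total_scores = [2, 3, 5, 7, 8, 9]

p = p_value(item_correct)
rpbis = pearson(item_correct, total_scores)

# Reversing the item so that low scorers get it right yields a negative Rpbis.
rpbis_reversed = pearson([1 - v for v in item_correct], total_scores)
print(p, round(rpbis, 3), round(rpbis_reversed, 3))
```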
For rating scale or partial credit items, the mean item score ranges from the minimum to the maximum of
the scale. For example, if the item has a rating scale of 1 to 5, the possible range for the mean is 1 to 5.
The Pearson r is similar to the Rpbis in that it ranges from -1 to 1, with a positive r indicating that
examinees with higher item scores tend to have higher total scores.
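For a polytomous item the same idea applies with the item score in place of the 0/1 indicator. The sketch below uses made-up ratings on a 1 to 5 scale and made-up total scores. The names are illustrative only.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings on a 1-to-5 scale and the examinees' total scores.
item_scores = [1, 2, 2, 4, 5, 5]
total_scores = [10, 12, 15, 20, 22, 25]

item_mean = sum(item_scores) / len(item_scores)  # must lie between 1 and 5
item_r = pearson(item_scores, total_scores)
print(round(item_mean, 3), round(item_r, 3))
```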
The option statistics table presents statistics for each individual option (alternative). The key thing to
examine in this portion of the table is whether any distractor has a higher Rpbis than the correct answer.
If one does, higher scoring examinees are selecting that incorrect option, which therefore might be
arguably correct.
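The check described above can be sketched as follows. Each option is coded 1 if selected and 0 otherwise, then correlated with total score. Any distractor whose Rpbis exceeds that of the keyed option is flagged. The data, the function names, and the choice to return NaN for an option with no variance (for example, one that nobody selected) are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns NaN when either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def option_rpbis(responses, scores, options):
    """Rpbis of each option: 1 if the examinee selected it, else 0."""
    return {
        opt: pearson([1 if r == opt else 0 for r in responses], scores)
        for opt in options
    }

def flag_distractors(rpbis_by_option, key):
    """Distractors whose Rpbis is higher than that of the keyed option."""
    key_r = rpbis_by_option[key]
    return [o for o, r in rpbis_by_option.items() if o != key and r > key_r]

# Hypothetical item keyed "A" where the high scorers actually choose "B".
responses = ["A", "A", "C", "B", "B", "B"]
scores = [2, 3, 5, 7, 8, 9]
stats = option_rpbis(responses, scores, ["A", "B", "C"])
flagged = flag_distractors(stats, key="A")
print(stats, flagged)
```

A flagged option is only a prompt to review the item and its key. The comparison by itself does not show that the key is wrong.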
The quantile plot data table simply presents the values calculated to create the quantile plot. The
quantile plot itself presents a useful picture of the item's performance; because this table contains the
same information, it can be used to examine that performance in detail to help diagnose possible issues.